Course
MCP • Memory • Workflow • Multi-Agent • Production
A complete open-source course for moving from chatbot demos to production-aware agent systems.
Full syllabus, tracks, graduation criteria, and a module-to-practice map.
Minimal Python examples for agents, tools, MCP, memory, workflow, colonies, evals, and RAG.
Lightweight behavior checks for tool use, RAG, workflow, security, and observability.
Instructor-ready 90-minute teaching plans for every module.
4-week, 8-week, and workshop formats for teams and learning cohorts.
Healthcare, finance, and enterprise demos with sample outputs.
Realistic healthcare, finance, and enterprise cases with risk boundaries and evals.
An agent is not magic. It is context, tools, memory, workflow, evaluation, and human approval arranged around a useful task.
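One way to read the sentence above is as a small loop in ordinary code. The sketch below is only an illustration under stated assumptions: `Tool`, `Agent`, `step`, and `evaluate` are hypothetical names invented here, not part of the course repository, and the "tools" are stand-in lambdas rather than real model or API calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Tool:
    """Hypothetical tool wrapper: a name, a callable, and a risk flag."""
    name: str
    run: Callable[[str], str]
    needs_approval: bool = False  # high-risk tools wait for a human decision


@dataclass
class Agent:
    """Hypothetical agent: a tool registry plus a simple list-based memory."""
    tools: dict[str, Tool]
    memory: list[str] = field(default_factory=list)

    def step(self, task: str, tool_name: str, approve: Callable[[str], bool]) -> str:
        tool = self.tools[tool_name]
        # Context: the current task plus the three most recent memory entries.
        context = " | ".join(self.memory[-3:] + [task])
        # Workflow: gated tools run only if the human approval callback agrees.
        if tool.needs_approval and not approve(f"{tool.name}: {task}"):
            result = "blocked: approval denied"
        else:
            result = tool.run(context)
        # Memory: record what happened so later steps can see it.
        self.memory.append(f"{tool.name} -> {result}")
        return result


def evaluate(result: str, must_contain: str) -> bool:
    """Evaluation: a deliberately tiny behavior check on the output."""
    return must_contain in result


agent = Agent(tools={
    "lookup": Tool("lookup", lambda ctx: f"found notes for: {ctx}"),
    "refund": Tool("refund", lambda ctx: "refund issued", needs_approval=True),
})

# The approval callback here always declines, standing in for a human reviewer.
r1 = agent.step("order 17 status", "lookup", approve=lambda request: False)
r2 = agent.step("refund order 17", "refund", approve=lambda request: False)
print(r1)
print(r2)
print(evaluate(r2, "blocked"))
```

The low-risk lookup runs without approval, while the gated refund is blocked because the callback declines. In a real system the callback would presumably route to a review queue rather than a lambda.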
Run every dependency-free example and showcase with one command.
Open script
Inspect retrieval, grounded answering, no-answer behavior, and RAG evaluation.
Open example
Run regression checks that separate correctness, format, and safety behavior.
Open example
Check tool use, RAG grounding, approval gates, injection defense, and traces.
Open benchmarks
Inspect explicit graph transitions, high-risk routing, and approval gates.
Open example
Trace decisions, tool calls, guardrails, and replayable incident evidence.
Open example
Block unsafe instructions from retrieved content before they reach the agent.
Open example
Route tasks by budget, latency target, and required answer quality.
Open example
Checkpoint work, resume safely, and preserve artifacts after interruption.
Open example
Practice tools, resources, prompts, authorization, and elicitation.
Open example
Redact PII, merge repeated memories, decay confidence, and delete sensitive records.
Open example
Assign owners, scopes, risk tiers, access reviews, and audit logs to agents.
Open example
Run regression, safety, adversarial, and golden trace checks before release.
Open example
Use traces, containment, hotfix evals, and postmortems when an agent fails.
Open playbook
Design approval, evidence, control, recovery, and trust patterns for agent products.
Open checklist
Register, review, monitor, and retire enterprise agents with clear ownership.
Open checklist
Use release, v1 readiness, and deployment review checklists before publishing changes.
Open release kit
Start with registry, risk assessment, and deployment review templates.
Open templates
Study agent papers from OpenAI, Google, Meta, Anthropic, Microsoft-adjacent ecosystems, Stanford, Princeton, and Tsinghua.
Open papers
Read concise engineering takeaways for ReAct, Toolformer, WebGPT, RAG, Reflexion, Voyager, AgentBench, SWE-agent, and safety papers.
Open notes
Compare practical frameworks for LLM app tests, RAG metrics, safety checks, regression gates, and CI evaluation.
Open guide
Explore agent frameworks, MCP projects, RAG tools, eval systems, observability, and ops infrastructure.
Open projects
Choose agent frameworks by control flow, state, tools, evaluation, observability, safety, and operations.
Open matrix
Use a 30-minute reading loop to extract architecture lessons from open-source agent repositories.
Open guide
Build GitHub-ready agent projects with evals, traces, safety boundaries, and architecture writeups.
Open projects
Start the final project from a runnable colony scaffold with regression evals.
Open starter
Routes health education requests with safety boundaries and escalation rules.
Open demo
Compares companies as research support without personalized investment advice.
Open demo
Classifies support tickets, routes work, and identifies approval needs.
Open demo
Study realistic safety and evaluation cases for healthcare, finance, and enterprise agents.
Open casebooks
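As a taste of what one of the cards above describes, routing tasks by budget, latency target, and required answer quality can be sketched in a few lines of dependency-free Python. This is only an illustration: the option names, costs, latencies, and the 1-to-5 quality scale are invented here and are not taken from the course's own routing example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    """Hypothetical model tier; every number below is illustrative."""
    name: str
    cost_per_call: float  # arbitrary cost units
    latency_ms: int       # typical response time
    quality: int          # 1 (rough draft) to 5 (best available)


OPTIONS = [
    ModelOption("small", 0.01, 200, 2),
    ModelOption("medium", 0.05, 600, 3),
    ModelOption("large", 0.30, 2000, 5),
]


def route(budget: float, max_latency_ms: int, min_quality: int,
          options: Sequence[ModelOption] = OPTIONS) -> Optional[ModelOption]:
    """Return the cheapest option that satisfies all three constraints.

    Returns None when nothing fits, so the caller can escalate or relax a
    constraint explicitly instead of silently overspending.
    """
    feasible = [
        o for o in options
        if o.cost_per_call <= budget
        and o.latency_ms <= max_latency_ms
        and o.quality >= min_quality
    ]
    if not feasible:
        return None
    # Prefer lower cost; break ties on lower latency.
    return min(feasible, key=lambda o: (o.cost_per_call, o.latency_ms))


choice = route(budget=0.10, max_latency_ms=1000, min_quality=3)
print(choice.name if choice else "no feasible option")
```

Returning `None` rather than falling back to the largest model is a design choice in this sketch: it keeps the budget constraint hard and leaves the escalation decision to the surrounding workflow.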